Ordinal Assignment

Authors

Dr. Tengku Muhammad Huzaifah bin Tengku Mokhtar

Dr. Muhammad Abdul Hafiz bin Kamarul Zaman

Dr. Muhammad Za’im bin Mohd Samsuri

Published

June 12, 2025

knitr::include_graphics("group.gif")

1 Introduction

The dataset contains 4,092 rows and 21 features collected from community-based health screening data. One of the key variables of interest is fasting blood sugar (FBS), which will be categorized into three clinical classes:

Normal (FBS < 5.6 mmol/L)

Prediabetes (FBS 5.6–6.9 mmol/L)

Diabetes (FBS ≥ 7.0 mmol/L)

The dataset includes a variety of demographic, anthropometric, and biochemical features such as:

Demographic data: age, gender, rural/urban locality

Lifestyle indicators: smoking status, hypertension history

Anthropometric measures: height, weight, waist and hip circumference

Blood pressure: mean systolic and diastolic blood pressure

Biochemical values: HbA1c, cholesterol profiles (HDL, LDL, triglycerides, total cholesterol), and oral glucose tolerance test (OGTT) values

This dataset is suitable for exploring predictors of abnormal fasting blood sugar levels using ordinal logistic regression.

2 Preparation

library(here)
library(tidyverse)
library(haven)
library(gtsummary)
library(VGAM)
library(nnet)
library(broom)
library(knitr)
library(kableExtra)
library(tibble)
library(purrr)
library(gt)
library(ggplot2)
library(ggeffects)
library(reshape2)
library(data.table)
library(ordinal)
library(foreign)
library(brant)
library(patchwork)

3 Read Data

datafbs.o <- read_csv("datamssm_b.csv")
summary(datafbs.o)
   codesub               age            hpt              smoking         
 Length:4092        Min.   :18.00   Length:4092        Length:4092       
 Class :character   1st Qu.:38.00   Class :character   Class :character  
 Mode  :character   Median :48.00   Mode  :character   Mode  :character  
                    Mean   :48.05                                        
                    3rd Qu.:58.00                                        
                    Max.   :89.00                                        
                                                                         
     dmdx               height          weight           waist       
 Length:4092        Min.   :1.270   Min.   : 30.00   Min.   : 50.80  
 Class :character   1st Qu.:1.510   1st Qu.: 53.80   1st Qu.: 77.00  
 Mode  :character   Median :1.560   Median : 62.30   Median : 86.00  
                    Mean   :1.568   Mean   : 63.92   Mean   : 86.44  
                    3rd Qu.:1.630   3rd Qu.: 72.00   3rd Qu.: 95.00  
                    Max.   :1.960   Max.   :187.80   Max.   :154.50  
                    NA's   :1       NA's   :1        NA's   :2       
      hip             msbpr           mdbpr            hba1c       
 Min.   : 61.00   Min.   : 68.5   Min.   : 41.50   Min.   : 0.200  
 1st Qu.: 91.00   1st Qu.:117.0   1st Qu.: 70.00   1st Qu.: 5.100  
 Median : 97.00   Median :130.0   Median : 78.00   Median : 5.400  
 Mean   : 97.92   Mean   :133.9   Mean   : 78.56   Mean   : 5.829  
 3rd Qu.:104.00   3rd Qu.:147.0   3rd Qu.: 86.00   3rd Qu.: 5.800  
 Max.   :160.00   Max.   :237.0   Max.   :128.50   Max.   :15.000  
 NA's   :2                                         NA's   :22      
      fbs            mogtt1h          mogtt2h          totchol      
 Min.   : 2.500   Min.   : 0.160   Min.   : 0.160   Min.   : 0.180  
 1st Qu.: 4.480   1st Qu.: 6.660   1st Qu.: 5.240   1st Qu.: 4.970  
 Median : 5.180   Median : 8.660   Median : 6.680   Median : 5.700  
 Mean   : 5.734   Mean   : 9.212   Mean   : 7.436   Mean   : 5.790  
 3rd Qu.: 6.020   3rd Qu.:10.910   3rd Qu.: 8.480   3rd Qu.: 6.525  
 Max.   :28.010   Max.   :41.500   Max.   :37.370   Max.   :23.140  
                  NA's   :495      NA's   :497      NA's   :5       
    ftrigliz           hdl             ldl            gender         
 Min.   : 0.110   Min.   :0.080   Min.   : 0.140   Length:4092       
 1st Qu.: 0.930   1st Qu.:1.110   1st Qu.: 2.800   Class :character  
 Median : 1.260   Median :1.320   Median : 3.460   Mode  :character  
 Mean   : 1.541   Mean   :1.349   Mean   : 3.544                     
 3rd Qu.: 1.780   3rd Qu.:1.550   3rd Qu.: 4.230                     
 Max.   :12.660   Max.   :4.430   Max.   :10.560                     
 NA's   :4        NA's   :3       NA's   :4                          
    crural         
 Length:4092       
 Class :character  
 Mode  :character  
                   
                   
                   
                   
glimpse(datafbs.o)
Rows: 4,092
Columns: 21
$ codesub  <chr> "AHA215459", "LAW215133", "SAL215736", "MJB216145", "TIS22632…
$ age      <dbl> 44, 64, 41, 34, 58, 39, 48, 41, 34, 22, 38, 47, 33, 20, 51, 4…
$ hpt      <chr> "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "…
$ smoking  <chr> "still smoking", "never smoked", "never smoked", "still smoki…
$ dmdx     <chr> "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "…
$ height   <dbl> 1.63, 1.59, 1.46, 1.52, 1.48, 1.53, 1.48, 1.52, 1.55, 1.65, 1…
$ weight   <dbl> 52.65, 55.30, 49.90, 48.70, 43.35, 60.10, 70.70, 63.65, 50.00…
$ waist    <dbl> 71.0, 83.5, 70.0, 75.0, 78.1, 83.0, 102.6, 90.0, 73.0, 75.0, …
$ hip      <dbl> 85.0, 89.9, 88.0, 86.7, 87.8, 93.5, 102.9, 104.0, 87.0, 90.0,…
$ msbpr    <dbl> 124.5, 157.5, 115.5, 111.0, 140.5, 160.0, 119.5, 150.0, 110.5…
$ mdbpr    <dbl> 73.5, 68.0, 64.5, 68.5, 68.5, 99.5, 76.0, 82.0, 73.5, 79.0, 7…
$ hba1c    <dbl> 4.2, 5.3, 5.0, 5.1, 5.2, 5.9, 5.5, 4.4, 5.1, 4.9, 5.3, 5.5, 5…
$ fbs      <dbl> 2.50, 2.50, 2.51, 2.52, 2.52, 2.52, 2.52, 2.52, 2.55, 2.55, 2…
$ mogtt1h  <dbl> 2.04, 9.99, 7.82, 5.67, 10.28, 11.44, 6.31, 9.00, 5.03, 4.30,…
$ mogtt2h  <dbl> 3.30, 8.44, 6.42, 5.45, 4.56, 6.95, 5.58, 7.07, 4.55, 5.34, 7…
$ totchol  <dbl> 4.18, 5.27, 5.10, 5.55, 6.72, 6.69, 7.28, 5.74, 3.48, 4.08, 4…
$ ftrigliz <dbl> 1.20, 1.28, 0.68, 0.86, 1.13, 1.59, 2.18, 1.24, 0.26, 0.96, 0…
$ hdl      <dbl> 1.07, 0.99, 1.47, 1.49, 1.46, 1.28, 1.20, 1.22, 1.04, 0.81, 1…
$ ldl      <dbl> 2.43, 3.64, 2.73, 3.56, 4.54, 4.08, 5.02, 3.70, 2.01, 2.34, 2…
$ gender   <chr> "male", "male", "female", "male", "female", "female", "female…
$ crural   <chr> "rural", "rural", "rural", "rural", "urban", "rural", "rural"…

4 Convert character to factor

datafbs.o <- datafbs.o %>%
  mutate(across(where(is.character), as.factor))
glimpse(datafbs.o)
Rows: 4,092
Columns: 21
$ codesub  <fct> AHA215459, LAW215133, SAL215736, MJB216145, TIS226328, red115…
$ age      <dbl> 44, 64, 41, 34, 58, 39, 48, 41, 34, 22, 38, 47, 33, 20, 51, 4…
$ hpt      <fct> no, no, no, no, no, no, no, no, no, no, no, no, no, no, no, n…
$ smoking  <fct> still smoking, never smoked, never smoked, still smoking, nev…
$ dmdx     <fct> no, no, no, no, no, no, no, no, no, no, no, no, no, no, no, n…
$ height   <dbl> 1.63, 1.59, 1.46, 1.52, 1.48, 1.53, 1.48, 1.52, 1.55, 1.65, 1…
$ weight   <dbl> 52.65, 55.30, 49.90, 48.70, 43.35, 60.10, 70.70, 63.65, 50.00…
$ waist    <dbl> 71.0, 83.5, 70.0, 75.0, 78.1, 83.0, 102.6, 90.0, 73.0, 75.0, …
$ hip      <dbl> 85.0, 89.9, 88.0, 86.7, 87.8, 93.5, 102.9, 104.0, 87.0, 90.0,…
$ msbpr    <dbl> 124.5, 157.5, 115.5, 111.0, 140.5, 160.0, 119.5, 150.0, 110.5…
$ mdbpr    <dbl> 73.5, 68.0, 64.5, 68.5, 68.5, 99.5, 76.0, 82.0, 73.5, 79.0, 7…
$ hba1c    <dbl> 4.2, 5.3, 5.0, 5.1, 5.2, 5.9, 5.5, 4.4, 5.1, 4.9, 5.3, 5.5, 5…
$ fbs      <dbl> 2.50, 2.50, 2.51, 2.52, 2.52, 2.52, 2.52, 2.52, 2.55, 2.55, 2…
$ mogtt1h  <dbl> 2.04, 9.99, 7.82, 5.67, 10.28, 11.44, 6.31, 9.00, 5.03, 4.30,…
$ mogtt2h  <dbl> 3.30, 8.44, 6.42, 5.45, 4.56, 6.95, 5.58, 7.07, 4.55, 5.34, 7…
$ totchol  <dbl> 4.18, 5.27, 5.10, 5.55, 6.72, 6.69, 7.28, 5.74, 3.48, 4.08, 4…
$ ftrigliz <dbl> 1.20, 1.28, 0.68, 0.86, 1.13, 1.59, 2.18, 1.24, 0.26, 0.96, 0…
$ hdl      <dbl> 1.07, 0.99, 1.47, 1.49, 1.46, 1.28, 1.20, 1.22, 1.04, 0.81, 1…
$ ldl      <dbl> 2.43, 3.64, 2.73, 3.56, 4.54, 4.08, 5.02, 3.70, 2.01, 2.34, 2…
$ gender   <fct> male, male, female, male, female, female, female, female, fem…
$ crural   <fct> rural, rural, rural, rural, urban, rural, rural, urban, rural…

5 Data Wrangling

The predictors that will be selected in this dataset are hba1c, triglycerides, age, waist circumference, smoking status, and gender.

datafbs.o <- datafbs.o %>%
  select(fbs, hba1c, ftrigliz, age, waist, smoking, gender)

# inspect the selected data
glimpse(datafbs.o)
Rows: 4,092
Columns: 7
$ fbs      <dbl> 2.50, 2.50, 2.51, 2.52, 2.52, 2.52, 2.52, 2.52, 2.55, 2.55, 2…
$ hba1c    <dbl> 4.2, 5.3, 5.0, 5.1, 5.2, 5.9, 5.5, 4.4, 5.1, 4.9, 5.3, 5.5, 5…
$ ftrigliz <dbl> 1.20, 1.28, 0.68, 0.86, 1.13, 1.59, 2.18, 1.24, 0.26, 0.96, 0…
$ age      <dbl> 44, 64, 41, 34, 58, 39, 48, 41, 34, 22, 38, 47, 33, 20, 51, 4…
$ waist    <dbl> 71.0, 83.5, 70.0, 75.0, 78.1, 83.0, 102.6, 90.0, 73.0, 75.0, …
$ smoking  <fct> still smoking, never smoked, never smoked, still smoking, nev…
$ gender   <fct> male, male, female, male, female, female, female, female, fem…
#confirm the reference group
levels(datafbs.o$smoking)
[1] "never smoked"    "quitted smoking" "still smoking"  
levels(datafbs.o$gender)
[1] "female" "male"  

5.1 Rename Column

datafbs.o <- datafbs.o %>%
  rename(
    fbs_raw = fbs,
    hba1c = hba1c,
    triglycerides = ftrigliz,
    age = age,
    waist_circumference = waist,
    smoking_status = smoking,
    sex = gender
  )

6 Create categorical outcome variable

We will use cut() function to create a categorical (factor) outcome variable. Variable fbs will be categorized into a 3 category variable renamed as cat_fbs.

Then we use the label() function to label the variable.

datafbs.o <- datafbs.o %>%
  mutate(cat_fbs = cut(fbs_raw, 
                   breaks = c(-Inf, 5.6, 7.0, Inf),
                   labels = c("normal", "prediabetes", "diabetes"),
                   right = FALSE))

Let us checked if we have grouped it correctly

# Summarizing the min and max of 'fbs_raw' for each 'cat_fbs' group
datafbs.o %>%
  select(cat_fbs, fbs_raw) %>%
  group_by(cat_fbs) %>%
  summarize(
    min_fbs_raw = min(fbs_raw, na.rm = TRUE),
    max_fbs_raw = max(fbs_raw, na.rm = TRUE)
  )

Next, we will used ordered function to create an ordinal variable and define the levels of the variable.

We will reverse the order of the outcome variable.

lev <- c('diabetes','prediabetes','normal')
lev
[1] "diabetes"    "prediabetes" "normal"     
datafbs.o <- datafbs.o %>% 
  mutate(cat_fbs1 = fct_relevel(cat_fbs, lev)) %>%
  mutate(cat_fbs1 = ordered(cat_fbs1, levels = lev))

Let us check if we have done correctly:

str(datafbs.o$cat_fbs1)
 Ord.factor w/ 3 levels "diabetes"<"prediabetes"<..: 3 3 3 3 3 3 3 3 3 3 ...
levels(datafbs.o$cat_fbs)
[1] "normal"      "prediabetes" "diabetes"   
table(datafbs.o$cat_fbs)

     normal prediabetes    diabetes 
       2640         889         563 
levels(datafbs.o$cat_fbs1)
[1] "diabetes"    "prediabetes" "normal"     
table(datafbs.o$cat_fbs1)

   diabetes prediabetes      normal 
        563         889        2640 

7 Exploratory Data Analysis

datafbs.o %>%
  select(-fbs_raw, -cat_fbs1) %>%  # remove unwanted variables
  tbl_summary(
    by = cat_fbs,
    type = list(
      smoking_status ~ "categorical",
      sex ~ "categorical"
    ),
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = all_continuous() ~ 2,
    missing = "no"
  ) %>%
  add_overall() %>%
  bold_labels()
Characteristic Overall
N = 4,0921
normal
N = 2,6401
prediabetes
N = 8891
diabetes
N = 5631
hba1c 5.83 (1.46) 5.38 (0.67) 5.76 (0.96) 8.05 (2.46)
triglycerides 1.54 (1.07) 1.37 (0.88) 1.70 (1.10) 2.11 (1.50)
age 48.05 (14.48) 45.50 (14.67) 52.53 (13.35) 52.98 (12.09)
waist_circumference 86.44 (13.04) 84.47 (12.93) 89.21 (12.42) 91.31 (12.49)
smoking_status



    never smoked 3,136 (77%) 2,035 (77%) 669 (75%) 432 (77%)
    quitted smoking 322 (7.9%) 188 (7.1%) 80 (9.0%) 54 (9.6%)
    still smoking 634 (15%) 417 (16%) 140 (16%) 77 (14%)
sex



    female 2,664 (65%) 1,757 (67%) 554 (62%) 353 (63%)
    male 1,428 (35%) 883 (33%) 335 (38%) 210 (37%)
1 Mean (SD); n (%)

7.1 Plots

Use histogram for numerical predictors, while bar plot for categorical predictors.

  1. Hbaic

    # Histogram of HbA1c by FBS category
    p1 <- ggplot(datafbs.o, aes(x = hba1c, fill = cat_fbs)) +
      geom_histogram(binwidth = 0.2, position = "dodge", alpha = 0.7, color = "black") +
      facet_wrap(~ cat_fbs, scales = "free_y") +
      labs(
        title = "Distribution of HbA1c by FBS Category",
        x = "HbA1c (%)",
        y = "Count",
        fill = "FBS Category"
      ) +
      theme_minimal()
    
    p1

  2. Triglycerides

    # Histogram of Triglycerides by FBS category
    p2 <- ggplot(datafbs.o, aes(x = triglycerides, fill = cat_fbs)) +
      geom_histogram(binwidth = 0.2, position = "dodge", alpha = 0.7, color = "black") +
      facet_wrap(~ cat_fbs, scales = "free_y") +
      labs(
        title = "Distribution of Triglycerides by FBS Category",
        x = "triglycerides levels",
        y = "Count",
        fill = "FBS Category"
      ) +
      theme_minimal()
    
    p2

  3. Age

    # Histogram of age by FBS category
    p3 <- ggplot(datafbs.o, aes(x = age, fill = cat_fbs)) +
      geom_histogram(binwidth = 0.2, position = "dodge", alpha = 0.7, color = "black") +
      facet_wrap(~ cat_fbs, scales = "free_y") +
      labs(
        title = "Distribution of Age by FBS Category",
        x = "Age",
        y = "Count",
        fill = "FBS Category"
      ) +
      theme_minimal()
    
    p3

  4. Waist Circumference

    # Histogram of Waist Circumference by FBS category
    p4 <- ggplot(datafbs.o, aes(x = waist_circumference, fill = cat_fbs)) +
      geom_histogram(binwidth = 0.2, position = "dodge", alpha = 0.7, color = "black") +
      facet_wrap(~ cat_fbs, scales = "free_y") +
      labs(
        title = "Distribution of Waist Circumference by FBS Category",
        x = "Waist Circumference",
        y = "Count",
        fill = "FBS Category"
      ) +
      theme_minimal()
    
    p4

  5. Smoking Status

    # Bar plot of smoking status by FBS category
    p5 <- ggplot(datafbs.o, aes(x = cat_fbs, fill = smoking_status)) +
      geom_bar(position = "dodge", alpha = 0.7) +
      labs(title = "Distribution of Smoking Status by FBS Category",
           x = "FBS Category",
           y = "Count",
           fill = "Smoking Status") +
      theme_minimal()
    p5

  6. Sex

    # Bar plot of sex by FBS category
    p6 <- ggplot(datafbs.o, aes(x = cat_fbs, fill = sex)) +
      geom_bar(position = "dodge", alpha = 0.7) +
      labs(title = "Distribution of Sex by FBS Category",
           x = "FBS Category",
           y = "Count",
           fill = "Sex") +
      theme_minimal()
    p6

8 Estimation

8.1 Adjacent-category model or multinomial or baseline logit model

Here, we will show how to replicate the baseline logit model (unconstrained). We could not reproduce the adjancent-category models that is based on constrained logit models.

Generate a new variable but with no ordering.

datafbs.o <- 
  datafbs.o %>% 
  mutate(cat_fbs2 = fct_relevel(cat_fbs, lev))

Run the mlogit

mlogit1 <- 
  vglm(cat_fbs ~ smoking_status, multinomial, data = datafbs.o)
summary(mlogit1)

Call:
vglm(formula = cat_fbs ~ smoking_status, family = multinomial, 
    data = datafbs.o)

Coefficients: 
                                Estimate Std. Error z value Pr(>|z|)    
(Intercept):1                    1.54983    0.05297  29.256  < 2e-16 ***
(Intercept):2                    0.43736    0.06172   7.086 1.38e-12 ***
smoking_statusquitted smoking:1 -0.30237    0.16323  -1.852    0.064 .  
smoking_statusquitted smoking:2 -0.04432    0.18662  -0.237    0.812    
smoking_statusstill smoking:1    0.13946    0.13488   1.034    0.301    
smoking_statusstill smoking:2    0.16048    0.15472   1.037    0.300    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Names of linear predictors: log(mu[,1]/mu[,3]), log(mu[,2]/mu[,3])

Residual deviance: 7254.844 on 8178 degrees of freedom

Log-likelihood: -3627.422 on 8178 degrees of freedom

Number of Fisher scoring iterations: 4 

No Hauck-Donner effect found in any of the estimates


Reference group is level  3  of the response

8.1.1 Calculate Confidence interval for the model

conf_intervals <- confint(mlogit1)
print(conf_intervals)
                                     2.5 %     97.5 %
(Intercept):1                    1.4459984 1.65365264
(Intercept):2                    0.3163864 0.55833059
smoking_statusquitted smoking:1 -0.6222921 0.01755696
smoking_statusquitted smoking:2 -0.4100896 0.32145785
smoking_statusstill smoking:1   -0.1248981 0.40380863
smoking_statusstill smoking:2   -0.1427738 0.46373083

8.1.2 Get the RRR

exp(coef(mlogit1))
                  (Intercept):1                   (Intercept):2 
                      4.7106481                       1.5486111 
smoking_statusquitted smoking:1 smoking_statusquitted smoking:2 
                      0.7390663                       0.9566517 
  smoking_statusstill smoking:1   smoking_statusstill smoking:2 
                      1.1496474                       1.1740726 

Build final table using provided values.

# Define function to compute 95% CI for RRR
calc_ci_rrr <- function(estimate, stderr) {
  lower <- estimate - 1.96 * stderr
  upper <- estimate + 1.96 * stderr
  return(c(exp(lower), exp(upper)))
}

# Input full results manually
coefficients <- data.frame(
  Term = c(
    "(Intercept):1", "(Intercept):2",
    "smoking_statusquitted smoking:1", "smoking_statusquitted smoking:2",
    "smoking_statusstill smoking:1", "smoking_statusstill smoking:2"
  ),
  Estimate = c(
    1.54983, 0.43736,
    -0.30237, -0.04432,
    0.13946,  0.16048
  ),
  StdError = c(
    0.05297, 0.06172,
    0.16323, 0.18662,
    0.13488, 0.15472
  ),
  ZValue = c(
    29.256, 7.086,
    -1.852, -0.237,
    1.034, 1.037
  ),
  PValue = c(
    "<2e-16", "1.38e-12",
    "0.064", "0.812",
    "0.301", "0.300"
  )
)

# Calculate RRR and 95% CI for RRR
coefficients <- coefficients %>%
  mutate(
    RRR = exp(Estimate),
    CI_RRR = map2(Estimate, StdError, calc_ci_rrr)
  ) %>%
  unnest_wider(CI_RRR, names_sep = "_") %>%
  rename(
    CI_RRR_Lower = CI_RRR_1,
    CI_RRR_Upper = CI_RRR_2
  )

# Output the table
kable(coefficients, digits = 3, caption = "Combined Coefficients and RRR Table with Confidence Intervals")
Combined Coefficients and RRR Table with Confidence Intervals
Term Estimate StdError ZValue PValue RRR CI_RRR_Lower CI_RRR_Upper
(Intercept):1 1.550 0.053 29.256 <2e-16 4.711 4.246 5.226
(Intercept):2 0.437 0.062 7.086 1.38e-12 1.549 1.372 1.748
smoking_statusquitted smoking:1 -0.302 0.163 -1.852 0.064 0.739 0.537 1.018
smoking_statusquitted smoking:2 -0.044 0.187 -0.237 0.812 0.957 0.664 1.379
smoking_statusstill smoking:1 0.139 0.135 1.034 0.301 1.150 0.883 1.498
smoking_statusstill smoking:2 0.160 0.155 1.037 0.300 1.174 0.867 1.590

8.2 Continuation-ratio

To perform a continuation-ratio analysis for the cat_fbs variable (categorized as normal, prediabetes, and diabetes), we need to create two sequential binary comparisons that reflect the forward progression of glycemic status:

  1. compare cat_fbs=diabetes vs cat_fbs=prediabetes.

    This assesses the odds of progressing to diabetes among those already beyond the normal stage.

  2. compare cat_fbs=diabetes and prediabetes vs cat_fbs=normal.

    This evaluates the odds of having progressed beyond the normal glycemic state into either prediabetes or diabetes.

8.2.0.1 1. compare cat_fbs=diabetes vs cat_fbs=prediabetes

table(datafbs.o$cat_fbs) ; table(datafbs.o$cat_fbs1)

     normal prediabetes    diabetes 
       2640         889         563 

   diabetes prediabetes      normal 
        563         889        2640 
datafbs.o1 <- 
  datafbs.o %>% 
  filter(cat_fbs == 'diabetes' | cat_fbs == 'prediabetes')
cr1 <- glm(cat_fbs1 ~ smoking_status, family = binomial(link ='logit'),
           data = datafbs.o1)
summary(cr1)

Call:
glm(formula = cat_fbs1 ~ smoking_status, family = binomial(link = "logit"), 
    data = datafbs.o1)

Coefficients:
                              Estimate Std. Error z value Pr(>|z|)    
(Intercept)                    0.43736    0.06172   7.086 1.38e-12 ***
smoking_statusquitted smoking -0.04432    0.18662  -0.237    0.812    
smoking_statusstill smoking    0.16048    0.15472   1.037    0.300    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1939.1  on 1451  degrees of freedom
Residual deviance: 1937.8  on 1449  degrees of freedom
AIC: 1943.8

Number of Fisher Scoring iterations: 4

8.2.0.2 2. compare cat_fbs=diabetes and prediabetes vs cat_fbs=normal

For this we will recode variable fbs using ifelse function. Please, make sure the code will be 0 and 1. using code bigger will not make glm function work

datafbs.o2 <- datafbs.o %>% 
  filter(cat_fbs == 'diabetes' | 
         cat_fbs == 'prediabetes'| 
         cat_fbs == 'normal')
table(datafbs.o2$cat_fbs1)

   diabetes prediabetes      normal 
        563         889        2640 
datafbs.o2 <- datafbs.o2 %>% 
  mutate(cat_fbs2 = ifelse(cat_fbs1 == "diabetes", 0, 
                        ifelse(cat_fbs1 == "prediabetes",0,1)))
table(datafbs.o2$cat_fbs1) ; table(datafbs.o2$cat_fbs2)

   diabetes prediabetes      normal 
        563         889        2640 

   0    1 
1452 2640 

And run the next CR model:

cr2 <- 
  glm(cat_fbs2 ~ smoking_status, family = binomial(link ='logit'), 
           data = datafbs.o2)
summary(cr2)

Call:
glm(formula = cat_fbs2 ~ smoking_status, family = binomial(link = "logit"), 
    data = datafbs.o2)

Coefficients:
                              Estimate Std. Error z value Pr(>|z|)    
(Intercept)                    0.61428    0.03741  16.419   <2e-16 ***
smoking_statusquitted smoking -0.27567    0.11909  -2.315   0.0206 *  
smoking_statusstill smoking    0.03891    0.09168   0.424   0.6713    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 5322.8  on 4091  degrees of freedom
Residual deviance: 5317.0  on 4089  degrees of freedom
AIC: 5323

Number of Fisher Scoring iterations: 4
# Function to calculate confidence intervals
calc_ci <- function(estimate, stderr) {
  lower <- estimate - 1.96 * stderr
  upper <- estimate + 1.96 * stderr
  return(c(lower, upper))
}

# Create cr1 results
cr1_results <- data.frame(
  Model = "cr1",
  Term = c("(Intercept)", "smoking_statusquitted smoking", "smoking_statusstill smoking"),
  Estimate = c(0.43736, -0.04432, 0.16048),
  StdError = c(0.06172, 0.18662, 0.15472),
  ZValue = c(7.086, -0.237, 1.037),
  PValue = c(1.38e-12, 0.812, 0.300)
) %>%
  mutate(CI = purrr::map2(Estimate, StdError, calc_ci)) %>%
  tidyr::unnest_wider(CI, names_sep = "_")

# Create cr2 results
cr2_results <- data.frame(
  Model = "cr2",
  Term = c("(Intercept)", "smoking_statusquitted smoking", "smoking_statusstill smoking"),
  Estimate = c(0.61428, -0.27567, 0.03891),
  StdError = c(0.03741, 0.11909, 0.09168),
  ZValue = c(16.419, -2.315, 0.424),
  PValue = c(2e-16, 0.0206, 0.6713)
) %>%
  mutate(CI = purrr::map2(Estimate, StdError, calc_ci)) %>%
  tidyr::unnest_wider(CI, names_sep = "_")

# Combine the data frames
combined_results <- bind_rows(cr1_results, cr2_results) %>%
  rename(CI_Lower = CI_1, CI_Upper = CI_2)

# Print the combined table in a tidy format
kable(combined_results, format = "pipe", caption = "Combined Logistic Regression Results with Confidence Intervals")
Combined Logistic Regression Results with Confidence Intervals
Model Term Estimate StdError ZValue PValue CI_Lower CI_Upper
cr1 (Intercept) 0.43736 0.06172 7.086 0.0000 0.3163888 0.5583312
cr1 smoking_statusquitted smoking -0.04432 0.18662 -0.237 0.8120 -0.4100952 0.3214552
cr1 smoking_statusstill smoking 0.16048 0.15472 1.037 0.3000 -0.1427712 0.4637312
cr2 (Intercept) 0.61428 0.03741 16.419 0.0000 0.5409564 0.6876036
cr2 smoking_statusquitted smoking -0.27567 0.11909 -2.315 0.0206 -0.5090864 -0.0422536
cr2 smoking_statusstill smoking 0.03891 0.09168 0.424 0.6713 -0.1407828 0.2186028

8.2.0.3 3. Conclusion

Model 1 Comparison: Diabetes (vs. Prediabetes)

Among individuals with elevated fasting blood sugar levels (diabetes or prediabetes), the odds of being in the diabetes group rather than prediabetes group were:

  • 1.17 times higher for those who still smoke compared to non-smokers (OR = exp(0.16048) = 1.17; p = 0.300).

  • 0.96 times lower (or 4% lower odds) for those who had quit smoking compared to non-smokers (OR = exp(-0.04432) = 0.96; p = 0.812).

However, both associations were not statistically significant.

Model 2 Comparison: Diabetes and Prediabetes (vs. Normal)

When comparing those with any elevated FBS (diabetes or prediabetes) to those with normal levels, the odds of being in the elevated group were:

  • 0.76 times lower (or 24% reduced odds) among former smokers compared to non-smokers (OR = exp(-0.27567) = 0.76; p = 0.021) — this was statistically significant.

  • 1.04 times higher for current smokers, but the association was not statistically significant (OR = exp(0.03891) = 1.04; p = 0.671).

The estimates from the continuation-ratio models are relatively consistent across the comparisons, especially for the effect of smoking. In general, former smokers were less likely to fall into worse FBS categories, while current smoking status did not show a strong or consistent effect. The significant association in the second model indicates that quitting smoking may reduce the odds of having elevated fasting blood sugar compared to maintaining normal levels.

9 Prediction

9.1 Predicted probability

Use polr package

polr in MASS package

polr_cr3 <- MASS::polr(cat_fbs ~ age, data = datafbs.o, Hess = TRUE)
summary(polr_cr3)
Call:
MASS::polr(formula = cat_fbs ~ age, data = datafbs.o, Hess = TRUE)

Coefficients:
      Value Std. Error t value
age 0.03429   0.002334   14.69

Intercepts:
                     Value   Std. Error t value
normal|prediabetes    2.2862  0.1223    18.6862
prediabetes|diabetes  3.5736  0.1300    27.4844

Residual Deviance: 7032.948 
AIC: 7038.948 

Then the probabilities are:

prob_polr <- predict(polr_cr3, type = 'probs')
head(prob_polr) ; tail(prob_polr)
     normal prediabetes   diabetes
1 0.6850904   0.2023296 0.11258001
2 0.5228294   0.2759640 0.20120661
3 0.7068529   0.1904433 0.10270380
4 0.7540272   0.1633776 0.08259527
5 0.5737448   0.2561011 0.17015413
6 0.7208610   0.1825860 0.09655301
        normal prediabetes   diabetes
4087 0.6924417   0.1983592 0.10919907
4088 0.6391092   0.2260582 0.13483260
4089 0.6850904   0.2023296 0.11258001
4090 0.6069055   0.2414437 0.15165076
4091 0.4799882   0.2898325 0.23017931
4092 0.7277091   0.1786883 0.09360262
# Create a new data frame with age, smoke, and gender
newdat <- cbind(datafbs.o[, c("age", "smoking_status","sex")], prob_polr)

# Reshape the data to long format
lnewdat <- melt(newdat, id.vars = c("age", "smoking_status", "sex"), 
                variable.name = "fbs_category", value.name = "Probability")

# View the first few rows
head(lnewdat)
ggplot(lnewdat, aes(x = age, y = Probability, colour = fbs_category)) +
  geom_line() + facet_grid(sex ~ smoking_status, labeller="label_both")

9.2 Manual calculation for prediction

summary(o.age)
formula: cat_fbs ~ age
data:    datafbs.o

 link  threshold nobs logLik   AIC     niter max.grad cond.H 
 logit flexible  4092 -3516.47 7038.95 5(0)  1.49e-10 8.1e+04

Coefficients:
    Estimate Std. Error z value Pr(>|z|)    
age 0.034294   0.002334    14.7   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
                     Estimate Std. Error z value
normal|prediabetes     2.2862     0.1223   18.69
prediabetes|diabetes   3.5736     0.1300   27.49

And we want to predict these data

To predict the probabilities of i-th falls into each category for model for a new data, we need to create a new data first. We will use expand.grid() to create a new data of class data.frame

This is the new data

newData <- expand.grid(age = unique(datafbs.o$age))
head(newData)

And now the predictions

lp.o.age <- predict(o.age, newdata = newData, type = 'linear.predictor')
lp.o.age
$eta1
        normal prediabetes diabetes
1   0.77726489   2.0646541 99998.49
2   0.09138080   1.3787700 99997.81
3   0.88014751   2.1675367 99998.59
4   1.12020694   2.4075961 99998.83
5   0.29714603   1.5845352 99998.01
6   0.94873592   2.2361251 99998.66
7   0.64008808   1.9274773 99998.35
8   1.53173740   2.8191266 99999.25
9   0.98303012   2.2704193 99998.70
10  0.67438228   1.9617715 99998.39
11  1.15450115   2.4418903 99998.87
12  1.60032581   2.8877150 99999.31
13  0.53720546   1.8245946 99998.25
14  0.74297069   2.0303599 99998.46
15  1.01732433   2.3047135 99998.73
16  0.40002864   1.6874178 99998.11
17  0.84585330   2.1332425 99998.56
18  0.26285182   1.5502410 99997.98
19  0.70867648   1.9960657 99998.42
20  1.56603160   2.8534208 99999.28
21  0.02279239   1.3101816 99997.74
22  1.63462001   2.9220092 99999.35
23  1.18879535   2.4761845 99998.90
24  1.05161853   2.3390077 99998.77
25  1.08591274   2.3733019 99998.80
26  0.57149967   1.8588888 99998.29
27  1.22308956   2.5104787 99998.94
28  0.91444171   2.2018309 99998.63
29  1.25738376   2.5447729 99998.97
30  1.39456058   2.6819498 99999.11
31  0.33144023   1.6188294 99998.05
32  1.42885478   2.7162440 99999.14
33  1.49744319   2.7848324 99999.21
34  0.46861705   1.7560062 99998.18
35  1.32597217   2.6133613 99999.04
36  1.46314899   2.7505382 99999.18
37 -0.18297284   1.1044163 99997.53
38  0.60579387   1.8931830 99998.32
39 -0.11438443   1.1730048 99997.60
40  0.15996921   1.4473584 99997.87
41 -0.01150181   1.2758874 99997.70
42 -0.25156125   1.0358279 99997.46
43 -0.08009022   1.2072990 99997.63
44  1.29167796   2.5790671 99999.01
45 -0.04579602   1.2415932 99997.67
46  0.36573444   1.6531236 99998.08
47  1.66891422   2.9563034 99999.38
48  0.81155910   2.0989483 99998.53
49  1.36026637   2.6476556 99999.07
50  0.50291126   1.7903004 99998.22
51 -0.14867863   1.1387105 99997.57
52  0.12567501   1.4130642 99997.84
53 -0.52591488   0.7614743 99997.19
54 -0.28585545   1.0015337 99997.43
55  0.43432285   1.7217120 99998.15
56 -0.32014966   0.9672395 99997.39
57  0.19426341   1.4816526 99997.91
58 -0.45732647   0.8300627 99997.26
59 -0.38873806   0.8986511 99997.33
60  0.05708660   1.3444758 99997.77
61  0.22855762   1.5159468 99997.94
62 -0.21726704   1.0701221 99997.50
63 -0.62879750   0.6585917 99997.08
64 -0.42303227   0.8643569 99997.29
65 -0.49162068   0.7957685 99997.22
66 -0.59450329   0.6928859 99997.12
67 -0.35444386   0.9329453 99997.36
68 -0.76597432   0.5214149 99996.95
69 -0.66309170   0.6242975 99997.05
70 -0.56020909   0.7271801 99997.15
71 -0.73168011   0.5557091 99996.98

$eta2
      normal prediabetes  diabetes
1  -100001.5  0.77726489 2.0646541
2  -100002.2  0.09138080 1.3787700
3  -100001.4  0.88014751 2.1675367
4  -100001.2  1.12020694 2.4075961
5  -100002.0  0.29714603 1.5845352
6  -100001.3  0.94873592 2.2361251
7  -100001.6  0.64008808 1.9274773
8  -100000.8  1.53173740 2.8191266
9  -100001.3  0.98303012 2.2704193
10 -100001.6  0.67438228 1.9617715
11 -100001.1  1.15450115 2.4418903
12 -100000.7  1.60032581 2.8877150
13 -100001.7  0.53720546 1.8245946
14 -100001.5  0.74297069 2.0303599
15 -100001.3  1.01732433 2.3047135
16 -100001.9  0.40002864 1.6874178
17 -100001.4  0.84585330 2.1332425
18 -100002.0  0.26285182 1.5502410
19 -100001.6  0.70867648 1.9960657
20 -100000.7  1.56603160 2.8534208
21 -100002.3  0.02279239 1.3101816
22 -100000.7  1.63462001 2.9220092
23 -100001.1  1.18879535 2.4761845
24 -100001.2  1.05161853 2.3390077
25 -100001.2  1.08591274 2.3733019
26 -100001.7  0.57149967 1.8588888
27 -100001.1  1.22308956 2.5104787
28 -100001.4  0.91444171 2.2018309
29 -100001.0  1.25738376 2.5447729
30 -100000.9  1.39456058 2.6819498
31 -100002.0  0.33144023 1.6188294
32 -100000.9  1.42885478 2.7162440
33 -100000.8  1.49744319 2.7848324
34 -100001.8  0.46861705 1.7560062
35 -100001.0  1.32597217 2.6133613
36 -100000.8  1.46314899 2.7505382
37 -100002.5 -0.18297284 1.1044163
38 -100001.7  0.60579387 1.8931830
39 -100002.4 -0.11438443 1.1730048
40 -100002.1  0.15996921 1.4473584
41 -100002.3 -0.01150181 1.2758874
42 -100002.5 -0.25156125 1.0358279
43 -100002.4 -0.08009022 1.2072990
44 -100001.0  1.29167796 2.5790671
45 -100002.3 -0.04579602 1.2415932
46 -100001.9  0.36573444 1.6531236
47 -100000.6  1.66891422 2.9563034
48 -100001.5  0.81155910 2.0989483
49 -100000.9  1.36026637 2.6476556
50 -100001.8  0.50291126 1.7903004
51 -100002.4 -0.14867863 1.1387105
52 -100002.2  0.12567501 1.4130642
53 -100002.8 -0.52591488 0.7614743
54 -100002.6 -0.28585545 1.0015337
55 -100001.9  0.43432285 1.7217120
56 -100002.6 -0.32014966 0.9672395
57 -100002.1  0.19426341 1.4816526
58 -100002.7 -0.45732647 0.8300627
59 -100002.7 -0.38873806 0.8986511
60 -100002.2  0.05708660 1.3444758
61 -100002.1  0.22855762 1.5159468
62 -100002.5 -0.21726704 1.0701221
63 -100002.9 -0.62879750 0.6585917
64 -100002.7 -0.42303227 0.8643569
65 -100002.8 -0.49162068 0.7957685
66 -100002.9 -0.59450329 0.6928859
67 -100002.6 -0.35444386 0.9329453
68 -100003.1 -0.76597432 0.5214149
69 -100002.9 -0.66309170 0.6242975
70 -100002.8 -0.56020909 0.7271801
71 -100003.0 -0.73168011 0.5557091

The coefficients for model o.age

coef.o.age<- coef(o.age)
coef.o.age
  normal|prediabetes prediabetes|diabetes                  age 
           2.2862099            3.5735991            0.0342942 
age_value <- 44
lp.o1.bx <- coef.o.age['age'] * age_value

Putting them inside the equation

lp.o1.bx <- 0.0342942 * 44
lp.o1.bx <- 1.5089448

And complete the Eq 8.25 in Hosmer

logit1 <- coef.o.age['normal|prediabetes'] - lp.o1.bx
logit2 <- coef.o.age['prediabetes|diabetes'] - lp.o1.bx

Then the probabilities are

pLeq1 <- 1 / (1 + exp(-logit1))  # p(Y <= first threshold)
pLeq2 <- 1 / (1 + exp(-logit2))  # p(Y <= second threshold)

pLeq1 <- 1 / (1 + exp(-0.7772651))  # = 0.6850903
pLeq2 <- 1 / (1 + exp(-2.0646543))  # = 0.8874200

pMat <- cbind(
  p1 = pLeq1,                   # Probability of being in the first category
  p2 = pLeq2 - pLeq1,           # Probability of being in the second category
  p3 = 1 - pLeq2                # Probability of being in the third category
)

pMat
            p1        p2      p3
[1,] 0.6850904 0.2023296 0.11258

Let us confirm with the prediction made by clm()

predict(o.age, newdata = newData, type = 'prob')
$fit
      normal prediabetes   diabetes
1  0.6850903   0.2023296 0.11258002
2  0.5228293   0.2759641 0.20120662
3  0.7068528   0.1904434 0.10270382
4  0.7540271   0.1633776 0.08259529
5  0.5737447   0.2561012 0.17015414
6  0.7208609   0.1825861 0.09655302
7  0.6547734   0.2181966 0.12703007
8  0.8222604   0.1214403 0.05629932
9  0.7277090   0.1786883 0.09360263
10 0.6624837   0.2142408 0.12327546
11 0.7603321   0.1596343 0.08003362
12 0.8320639   0.1151719 0.05276421
13 0.6311621   0.2299544 0.13888347
14 0.6776451   0.2063029 0.11605200
15 0.7344511   0.1748156 0.09073334
16 0.5986945   0.2451897 0.15611572
17 0.6996966   0.1943959 0.10590756
18 0.5653372   0.2596113 0.17505146
19 0.6701086   0.2102747 0.11961662
20 0.8272171   0.1182781 0.05450476
21 0.5056979   0.2818457 0.21245646
22 0.8368016   0.1121222 0.05107623
23 0.7665255   0.1559298 0.07754469
24 0.7410856   0.1709709 0.08794347
25 0.7476113   0.1671574 0.08523135
26 0.6391091   0.2260582 0.13483262
27 0.7726068   0.1522664 0.07512684
28 0.7139082   0.1865056 0.09958619
29 0.7785754   0.1486462 0.07277843
30 0.8013193   0.1346338 0.06404690
31 0.5821098   0.2525239 0.16536637
32 0.8067228   0.1312556 0.06202161
33 0.8171928   0.1246578 0.05814933
34 0.6150564   0.2376524 0.14729124
35 0.7901736   0.1415430 0.06828344
36 0.8120138   0.1279299 0.06005626
37 0.4543840   0.2967027 0.24891332
38 0.6469807   0.2221373 0.13088197
39 0.4714350   0.2922527 0.23631229
40 0.5399072   0.2696843 0.19040845
41 0.4971246   0.2846243 0.21825110
42 0.4374393   0.3006049 0.26195580
43 0.4799881   0.2898325 0.23017932
44 0.7844311   0.1450711 0.07049783
45 0.4885530   0.2872882 0.22415879
46 0.5904279   0.2488849 0.16068723
47 0.8414310   0.1091296 0.04943944
48 0.6924416   0.1983593 0.10919909
49 0.7958030   0.1380634 0.06613366
50 0.6231432   0.2338209 0.14303589
51 0.4628987   0.2945442 0.24255718
52 0.5313775   0.2728713 0.19575120
53 0.3714702   0.3102036 0.31832627
54 0.4290188   0.3023412 0.26863998
55 0.6069055   0.2414438 0.15165078
56 0.4206393   0.3039297 0.27543107
57 0.5484137   0.2664084 0.18517793
58 0.3876202   0.3087479 0.30363181
59 0.4040211   0.3066511 0.28932777
60 0.5142678   0.2789572 0.20677498
61 0.5568920   0.2630489 0.18005915
62 0.4458959   0.2987242 0.25537986
63 0.3477833   0.3111607 0.34105604
64 0.3957914   0.3077788 0.29642987
65 0.3795119   0.3095567 0.31093140
66 0.3556023   0.3110063 0.33339140
67 0.4123052   0.3053672 0.28232756
68 0.3173506   0.3101280 0.37252145
69 0.3400454   0.3111499 0.34880469
70 0.3634991   0.3106871 0.32581384
71 0.3248261   0.3106330 0.36454088

9.3 Checking proportional odds assumption

brant(polr_cr3)
-------------------------------------------- 
Test for    X2  df  probability 
-------------------------------------------- 
Omnibus     8.34    1   0
age     8.34    1   0
-------------------------------------------- 

H0: Parallel Regression Assumption holds

10 Results

datafbs.o %>%
  select(-fbs_raw, -cat_fbs1, -cat_fbs2) %>%  # remove unwanted variables
  tbl_summary(
    by = cat_fbs,
    type = list(
      smoking_status ~ "categorical",
      sex ~ "categorical"
    ),
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = all_continuous() ~ 2,
    missing = "no"
  ) %>%
  add_overall() %>%
  bold_labels() %>%
  modify_caption("**Table 1: Characteristics of FBS Categories**")
Table 1: Characteristics of FBS Categories
Characteristic Overall
N = 4,0921
normal
N = 2,6401
prediabetes
N = 8891
diabetes
N = 5631
hba1c 5.83 (1.46) 5.38 (0.67) 5.76 (0.96) 8.05 (2.46)
triglycerides 1.54 (1.07) 1.37 (0.88) 1.70 (1.10) 2.11 (1.50)
age 48.05 (14.48) 45.50 (14.67) 52.53 (13.35) 52.98 (12.09)
waist_circumference 86.44 (13.04) 84.47 (12.93) 89.21 (12.42) 91.31 (12.49)
smoking_status



    never smoked 3,136 (77%) 2,035 (77%) 669 (75%) 432 (77%)
    quitted smoking 322 (7.9%) 188 (7.1%) 80 (9.0%) 54 (9.6%)
    still smoking 634 (15%) 417 (16%) 140 (16%) 77 (14%)
sex



    female 2,664 (65%) 1,757 (67%) 554 (62%) 353 (63%)
    male 1,428 (35%) 883 (33%) 335 (38%) 210 (37%)
1 Mean (SD); n (%)
data <- data.frame(
  Smoking_Status = c("Quitted Smoking", "Still Smoking"),
  Coefficient_normal_vs_diabetes = c(-0.30237, 0.13946),
  SE_normal_vs_diabetes = c(0.16323, 0.13488),
  p_value_normal_vs_diabetes = c(0.064, 0.301),
  CI_normal_vs_diabetes = c("(-0.62, 0.02)", "(-0.12, 0.40)"),
  Coefficient_prediabetes_vs_diabetes = c(-0.04432, 0.16048),
  SE_prediabetes_vs_diabetes = c(0.18662, 0.15472),
  p_value_prediabetes_vs_diabetes = c(0.812, 0.300),
  CI_prediabetes_vs_diabetes = c("(-0.41, 0.32)", "(-0.14, 0.46)")
)

# Display the table
kable(data,
      col.names = c("Smoking Status", 
                    "Coefficient", "SE", "p-value", "95% CI",
                    "Coefficient", "SE", "p-value", "95% CI"), 
      caption = "Table 2: Adjacent-category model or multinomial or baseline logit model") %>%
  kable_styling(full_width = FALSE, position = "center") %>%
  add_header_above(c(" " = 1, 
                     "Normal vs Diabetes" = 4, 
                     "Prediabetes vs Diabetes" = 4)) %>%
  column_spec(1, bold = TRUE) %>%
  row_spec(0, bold = TRUE)
Table 2: Adjacent-category model or multinomial or baseline logit model
Normal vs Diabetes
Prediabetes vs Diabetes
Smoking Status Coefficient SE p-value 95% CI Coefficient SE p-value 95% CI
Quitted Smoking -0.30237 0.16323 0.064 (-0.62, 0.02) -0.04432 0.18662 0.812 (-0.41, 0.32)
Still Smoking 0.13946 0.13488 0.301 (-0.12, 0.40) 0.16048 0.15472 0.300 (-0.14, 0.46)
ggplot(lnewdat, aes(x = age, y = Probability, colour = fbs_category)) +
  geom_line() + facet_grid(sex ~ smoking_status, labeller="label_both")

11 Intepretation

11.1 1. Adjacent-category model or multinomial or baseline logit model

11.1.1 Comparison of “Normal” vs. “Diabetes”

When compared to individuals who never smoked, the log odds of being in the “Normal” blood sugar category (vs. “Diabetes”) for those who quitted smoking changes by –0.30 (95% CI: –0.62 to 0.02, p-value = 0.064).
This suggests that those who quitted smoking have lower odds of being in the “Normal” group compared to the “Diabetes” group, but the association is not statistically significant. The confidence interval includes 0, indicating that the true effect could be null or even in the opposite direction.

For individuals who are still smoking, the log odds change by 0.14 (95% CI: –0.12 to 0.40, p-value = 0.301).
This indicates that current smokers have slightly higher odds of being in the “Normal” group relative to the “Diabetes” group compared to never smokers, but again, the result is not statistically significant, and the confidence interval includes 0.

11.1.2 Comparison of “Prediabetes” vs. “Diabetes”

When compared to individuals who never smoked, the log odds of being in the “Prediabetes” category (vs. “Diabetes”) for those who quitted smoking changes by –0.04 (95% CI: –0.41 to 0.32, p-value = 0.812).
This indicates no meaningful difference in odds between quitted smokers and never smokers for being in the “Prediabetes” vs. “Diabetes” category. The estimate is very close to 0 and the confidence interval is wide, suggesting high uncertainty.

For individuals who are still smoking, the log odds change by 0.16 (95% CI: –0.14 to 0.46, p-value = 0.300).
This indicates no significant association between current smoking and the odds of being in the “Prediabetes” category versus the “Diabetes” category when compared to never smokers. The confidence interval spans both negative and positive values.

11.2 2. Plot of Predicted Probabilities of fbs category by Age Faceted by Smoking status and gender

Age Effect

Across all panels, as age increases, the probability of being in the “normal” fasting blood sugar category (red line) decreases steadily, while the probability of being in the “prediabetes” (green) and “diabetes” (blue) categories increases.
This trend is consistent for both males and females, indicating that older age is associated with higher likelihood of abnormal fasting blood sugar levels, regardless of smoking status.

Smoking Status Effect

  • Never Smoked (left column):
    Individuals who never smoked consistently show the highest probabilities of being in the “normal” category and the lowest probabilities of diabetes, especially at younger ages. This suggests a protective pattern for never smokers across the lifespan.

  • Quitted Smoking (middle column):
    Compared to never smokers, individuals who quitted smoking show a modestly lower probability of remaining in the normal range and higher probabilities of prediabetes and diabetes, especially at older ages. The gap between normal and abnormal categories narrows faster with age than in never smokers.

  • Still Smoking (right column):
    Current smokers exhibit the lowest probabilities of being in the normal category across all ages. The probability of diabetes increases more sharply with age, particularly for males, suggesting higher long-term metabolic risk among current smokers.

Sex Differences

  • Across all smoking categories, females tend to maintain higher probabilities of being in the normal category and lower probabilities of diabetes compared to males.

  • Males show a more pronounced age-related decline in the probability of normal blood sugar and a steeper increase in diabetes probability.